All reversible dynamics in maximally non-local theories are trivial 
A remarkable feature of quantum theory is non-locality (i.e. the presence of correlations which violate Bell
inequalities). However, quantum correlations are not maximally non-local, and it is natural to ask whether there 
are compelling reasons for rejecting theories in which stronger violations are possible. To shed light on this 
question, we consider post-quantum theories in which maximally non-local states (non-local boxes) occur. It 
has previously been conjectured that the set of dynamical transformations possible in such theories is severely 
limited. We settle the question affirmatively in the case of reversible dynamics, by completely characterizing 
all such transformations allowed in this setting. We find that the dynamical group is trivial, in the sense that it 
is generated solely by local operations and permutations of systems. In particular, no correlations can ever be 
created; non-local boxes cannot be prepared from product states (in other words, no analogues of entangling 
unitary operations exist), and classical computers can efficiently simulate all such processes. 

Introduction. — Quantum mechanics exhibits the remarkable
feature of non-local correlations, as highlighted in Bell's
seminal paper [1]. Such correlations have (up to a few remaining
loopholes) been extensively verified in experiments [2].

Aside from their theoretical importance, non-local correlations
can be exploited for technological use: they are vital
in entanglement-based quantum key distribution schemes [3],
for example, where their presence can be used to guarantee
security (see also [4] for a recent review).

While quantum mechanics violates Bell inequalities, it does 
not do so in the maximal possible way. There are conceiv- 
able devices, so-called non-local or Popescu-Rohrlich boxes, 
that permit even stronger correlations than quantum mechanics
does, while respecting the no-signalling principle [5–7].
Such correlations are not observed in nature and the question 
arises as to whether other fundamental principles might be vi- 
olated if they were to exist. 

There has already been some progress towards answering 
this question. For example, the existence of non-local boxes
would make some communication complexity problems
trivial [8, 9], would make oblivious transfer possible [10], and
would violate information causality [11]. It has also
been realized that in a theory in which maximally Bell-violating
correlations emerge, the set of possible dynamical transformations
would be severely restricted compared to those allowed
in quantum theory [12]. While a complete classification
of the dynamics has remained elusive, it has been shown, for
example, that entanglement swapping is impossible [13, 14].
Furthermore, the question of the computational power of such 
a theory has been raised HHQj]]. 

FIG. 1: Two-dimensional caricature of the (normalized) boxworld
state space formed by stellating a square. Local vertices are denoted
by L and non-local ones by NL. No symmetries of this object take L
states to NL states or vice versa.

We work in the framework of generalized probabilistic theories
[12, 15–17], adopting the pragmatic operational view
that the physical content of a theory is in the predicted statis- 
tics of measurement outcomes given preparations and trans- 
formations. The framework makes minimal assumptions and 
allows for mathematical rigour. We consider a system composed
of N subsystems. To each subsystem one of M ≥ 1
measurements may be applied, yielding one of K ≥ 2 outcomes
(in the following, unless otherwise stated, we assume
each subsystem has the same M and K). The state space
contains all non-signalling correlations, corresponding to so-called
generalized non-signalling theory [12] or, more colloquially,
boxworld.

Our main result (Theorem 1) is that (except in the case 
M = 1 which corresponds to classical theory) the set of re- 
versible transformations in boxworld is trivial: all such opera- 
tions are a combination of local operations on a single system 
(which correspond to relabellings of measurements and their 
outcomes) and permutations of local systems (which correspond
to relabellings of subsystems). This solves the aforementioned
open problem concerning the computational power
of boxworld in the case of reversible dynamics HHGHl- 

Another interesting consequence is that, in boxworld, mea- 
surements and dynamics are necessarily distinct physical pro- 
cesses, in the sense that a measurement cannot be seen as a 
reversible dynamics on the system comprising the state and 
measurement device (cf. quantum theory, where the measure- 
ment process can be seen as a unitary evolution from the point 
of view of an external observer). We discuss this further in the 
final section. 

We note that, in the case of a classical-boxworld hybrid 
system, Theorem 1 does not hold — we give an example of a 
CNOT operation on this system at the end of the paper. How- 
ever, for all types of system, including those where the number 
of measurements and outcomes differs among the subsystems, 
reversible dynamics map pure product states to pure product 
states — that is, non-local states cannot be reversibly prepared 
from product states. This is our second main result (Theo- 
rem 2). 

A geometric intuition behind this result is as follows. The 
state space of the theory is a convex polytope, and reversible 
transformations must map it to itself. They therefore corre- 
spond to symmetries of the polytope. The polytope is in some 
way stellated, with the vertices corresponding to maximally 
non-local states having a different character from local ones. 
They are hence not connected by symmetries of the polytope. 
A two-dimensional caricature is shown in Figure 1.

The presentation proceeds as follows. We begin by for- 
mally introducing boxworld, then proceed to give the math- 
ematical framework we will work with. This is essentially 
the standard generalized probabilistic framework, as used 
in [12, 15–17]. For clarity of exposition, in the main text
we restrict to the case of two binary measurements (M = 2, 
K = 2) and give proofs of the main theorems for this case. 
The general case is deferred to the appendix, where the proofs 
are slightly more complicated but analogous. 

Boxworld. — Recall that we have a system comprising N 
subsystems and, on each subsystem, one of M possible mea- 
surements can be applied (corresponding to different measure- 
ment devices), yielding one of K possible outcomes (in the 
most general case, $K$ depends on the measurement). The
local measurements are denoted $\{X_0, X_1, \ldots, X_{M-1}\}$. A
measurement on the entire system made up of local measurements
can then be described by a string $A_1 \ldots A_N$, where
$A_i \in \{X_0, X_1, \ldots, X_{M-1}\}$ specifies the measurement
applied to the $i$th subsystem. Similarly, the corresponding outcomes
are denoted $a_1 \ldots a_N$, with $a_i \in \{0, 1, \ldots, K-1\}$.
Measurement-outcome pairs are called effects, e.g. a measurement
of $X_1$ giving outcome 3. A state is then a function
$P : (a_1 \ldots a_N | A_1 \ldots A_N) \mapsto [0, 1]$, which gives the
probability of the effect that $A_1 \ldots A_N$ is measured and gives
outcomes $a_1 \ldots a_N$. More general measurements are possible:
a measurement is a collection of effects for which the sum 
of the outcome probabilities over the collection is 1 when 
acting on any state. Such measurements include procedures 
whereby the measurement performed on a particular subsys- 
tem depends on the outcomes of previous measurements, con- 



vex combinations of such procedures and more fTill . How- 
ever, the statistics of the local measurements A\ . . . An are 
sufficient to uniquely determine the outcome probabilities of 
all measurements, and hence can be used to specify the state. 
This non-trivial assumption is known as the local observability
principle [18].

Furthermore, the subsystems can be spatially separated, and 
hence we require that P satisfies the non-signalling condi- 
tions, i.e. that 

$$\sum_{a_i=0}^{K-1} P(a_1, \ldots, a_i, \ldots, a_N \mid A_1, \ldots, A_i, \ldots, A_N) \qquad (1)$$

is independent of $A_i$. This implies that the marginal distribution
on some set of subsystems is independent of the choice
of measurement(s) on other subsystems.
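The non-signalling condition (1) can be checked mechanically for small cases. The following is a minimal sketch for N = 2 and M = K = 2, assuming the standard Popescu-Rohrlich box (outcomes satisfy a XOR b = x AND y, each with probability 1/2). The function names and the deliberately signalling counter-example are illustrative choices, not taken from the paper.

```python
from itertools import product

def pr_box(a, b, x, y):
    """P(a, b | x, y) for a PR box: a XOR b = x AND y, uniformly at random."""
    return 0.5 if (a ^ b) == (x & y) else 0.0

def signalling_box(a, b, x, y):
    """Hypothetical counter-example: outcome b copies the remote setting x."""
    return 1.0 if (a == 0 and b == x) else 0.0

def is_non_signalling(P, tol=1e-12):
    """Check condition (1) in both directions for two binary-input sites."""
    # Marginal of site 1 must not depend on the setting y of site 2.
    for a, x in product((0, 1), repeat=2):
        by_y = [sum(P(a, b, x, y) for b in (0, 1)) for y in (0, 1)]
        if abs(by_y[0] - by_y[1]) > tol:
            return False
    # Marginal of site 2 must not depend on the setting x of site 1.
    for b, y in product((0, 1), repeat=2):
        by_x = [sum(P(a, b, x, y) for a in (0, 1)) for x in (0, 1)]
        if abs(by_x[0] - by_x[1]) > tol:
            return False
    return True

def chsh(P):
    """CHSH value: sum over x, y of (-1)^(xy) times the correlator E(x, y)."""
    total = 0.0
    for x, y in product((0, 1), repeat=2):
        correlator = sum((-1) ** (a ^ b) * P(a, b, x, y)
                         for a, b in product((0, 1), repeat=2))
        total += (-1) ** (x * y) * correlator
    return total
```

Under these assumptions the PR box passes the check while reaching the algebraic maximum CHSH value of 4, above the quantum bound of 2√2.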

Boxworld is a physical theory whose state space consists 
of any P subject to: (i) P takes values in [0, 1]; (ii) P is normalized
in the obvious sense; and (iii) P satisfies the non-signalling
conditions (1). The constraints (i)-(iii) are such
that the state space is a convex polytope, which turns out to 
have a non-trivial structure. 

We first deal with the special case M = K = 2 (the case 
of so-called gbits [12]). The corresponding state spaces (defined
below) contain interesting non-local states, for example,
non-local boxes with maximal Bell violating correlations. We
label the measurements $X_0 = X$ and $X_1 = Z$.



Mathematical Framework. — We work in the generalized
probabilistic framework (see e.g. [12, 15–17]). Here, states
are represented as vectors embedded in a real vector space.
Effects will also be represented as vectors, such that the
probabilities of outcomes will be given by inner products between
the relevant vectors. We begin with the case of a single system
($N = 1$). We choose three linearly independent vectors
$X, Z, \mathbb{1} \in \mathbb{R}^3$. The vector $X$ is identified with $(1|X)$, which
is the effect that the $X$ measurement gives outcome 1. We define
a vector $\neg X := \mathbb{1} - X$ and associate it with $(0|X)$. The
prefix $\neg$ may be interpreted as a negation. Lastly, the $\neg Z$ effect
is defined analogously as $\neg Z := \mathbb{1} - Z$. Because $X, Z, \mathbb{1}$
are linearly independent, for every state $P$, there is a unique
vector $s \in \mathbb{R}^3$ representing $P$ in the sense that

$$\langle X, s\rangle = P(1|X), \qquad \langle Z, s\rangle = P(1|Z), \qquad \langle \mathbb{1}, s\rangle = 1.$$

It follows that $\langle \neg X, s\rangle = P(0|X)$ and likewise for $Z$. We
will refer to the set $\mathcal{P}^{(1)} = \{X, \neg X, Z, \neg Z\}$ as the single-site
extremal effects, for reasons that will become clear below.
(Note that the quantum analogue of our effect vectors are
projectors, and the inner product is analogous to the Hilbert-Schmidt
scalar product, mapping states, $\rho$, and projectors, $\Pi$,
to probabilities, $\mathrm{Tr}(\rho\Pi)$.)
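To make the single-system construction concrete, here is a small sketch that represents a gbit state as a vector and recovers outcome probabilities as inner products. The basis below (identity, X and Z as the three standard unit vectors) is an assumption chosen for readability; it is just one admissible choice of three linearly independent vectors and is not the representation used later in the text.

```python
# Assumed basis (illustrative only): identity, X, Z are standard unit vectors.
ONE = (1.0, 0.0, 0.0)
X = (0.0, 1.0, 0.0)
Z = (0.0, 0.0, 1.0)

def sub(u, v):
    """Componentwise difference u - v."""
    return tuple(a - b for a, b in zip(u, v))

def inner(u, v):
    """Standard inner product on R^3."""
    return sum(a * b for a, b in zip(u, v))

# Negated effects, defined exactly as in the text: not-A := 1 - A.
NOT_X = sub(ONE, X)
NOT_Z = sub(ONE, Z)
EFFECTS = {"X": X, "notX": NOT_X, "Z": Z, "notZ": NOT_Z}

def gbit_state(p_x, p_z):
    """State vector s with <X, s> = P(1|X) = p_x and <Z, s> = P(1|Z) = p_z."""
    return (1.0, p_x, p_z)

def is_valid_state(s, tol=1e-12):
    """Normalized, and every extremal effect yields a probability in [0, 1]."""
    normalized = abs(inner(ONE, s) - 1.0) <= tol
    in_range = all(-tol <= inner(e, s) <= 1.0 + tol for e in EFFECTS.values())
    return normalized and in_range
```

For example, `gbit_state(0.25, 1.0)` gives probability 0.75 for the effect `NOT_X`, while a vector such as `(1.0, 1.5, 0.0)` is rejected because it would assign a probability above 1.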

The $N$-subsystem extremal effects $\mathcal{P}^{(N)}$ are defined to be
the tensor products $A_1 \otimes \cdots \otimes A_N$, where each $A_i$ is a
single-site extremal effect, $A_i \in \mathcal{P}^{(1)}$ (the
reason for this definition is that it recovers the full set of
non-signalling distributions for the state space, as will be shown in
Lemma 1). We further define the identity on $N$ sites,
$\mathbb{1}^{(N)} := \mathbb{1} \otimes \cdots \otimes \mathbb{1}$. A central object is the convex cone
$\mathcal{K}^{(N)}$ generated by $\mathcal{P}^{(N)}$ [19]. This cone is the collection of
all vectors which can be written as a linear combination of elements
of $\mathcal{P}^{(N)}$ with non-negative coefficients. For any convex cone $\mathcal{K}$, one
can associate a dual cone $\mathcal{K}^* = \{s \mid \langle A, s\rangle \geq 0 \ \forall A \in \mathcal{K}\}$.
We will identify this with the set of unnormalized states.

Our interest in cones and duality stems from the following
lemma, which characterizes the state space of boxworld
in terms of the cone $\mathcal{K}^{(N)}$. It also implies the well-known
result that there are no entangled effects in boxworld.

Lemma 1. — Let $\mathcal{S}^{(N)}$ be the set of vectors $s$ in the dual cone
$(\mathcal{K}^{(N)})^*$ which satisfy $\langle \mathbb{1}^{(N)}, s\rangle = 1$. The space of (normalized)
states in boxworld can be represented by $\mathcal{S}^{(N)}$.

Proof. We use the notation $\neg^0 A := A$ and $\neg^1 A := \neg A$ for
$A \in \{X, Z\}$. The vectors $s \in \mathcal{S}^{(N)}$ will henceforth be called
states; they satisfy $\langle \mathbb{1}^{(N)}, s\rangle = 1$ and $\langle B, s\rangle \geq 0$ for all
$B \in \mathcal{K}^{(N)}$. To every state $s$, we associate a probability distribution
$P$ via

$$P(a_1, \ldots, a_N | A_1, \ldots, A_N) := \langle \neg^{a_1} A_1 \otimes \ldots \otimes \neg^{a_N} A_N, s\rangle$$

for $A_i \in \{X, Z\}$ and $a_i \in \{0, 1\}$. First we show that every
such $P$ is a valid non-signalling probability distribution. By
definition, $P$ is non-negative. To see that it is normalized, note
that we can decompose the identity

$$\mathbb{1}^{(N)} = \sum_{x \in \{0,1\}^N} \neg^{x_1} A_1 \otimes \neg^{x_2} A_2 \otimes \ldots \otimes \neg^{x_N} A_N \qquad (2)$$

for any choices of $A_i \in \{X, Z\}$, so that

$$\sum_{a_1, \ldots, a_N} P(a_1, \ldots, a_N | A_1, \ldots, A_N) = \langle \mathbb{1}^{(N)}, s\rangle = 1.$$

To see that $P$ is non-signalling, consider

$$\sum_{a_i} P(a_1, \ldots, a_i, \ldots, a_N | A_1, \ldots, A_N)$$
$$= \langle \neg^{a_1} A_1 \otimes \ldots \otimes A_i \otimes \ldots \otimes \neg^{a_N} A_N, s\rangle + \langle \neg^{a_1} A_1 \otimes \ldots \otimes \neg A_i \otimes \ldots \otimes \neg^{a_N} A_N, s\rangle$$
$$= \langle \neg^{a_1} A_1 \otimes \ldots \otimes \mathbb{1} \otimes \ldots \otimes \neg^{a_N} A_N, s\rangle,$$

which is independent of $A_i$.

To show that every non-signalling distribution has an
associated state, note that there is a unique vector
$s \in (\mathbb{R}^3)^{\otimes N}$ such that $\langle \neg^{a_1} A_1 \otimes \ldots \otimes \neg^{a_N} A_N, s\rangle =
P(a_1, \ldots, a_N | A_1, \ldots, A_N)$ for $\neg^{a_i} A_i \in \{X, Z, \neg X\}$ (since
these effects span the space). It is then easy to see that the
no-signalling property enforces consistency also in the case
that $\neg^{a_i} A_i = \neg Z$ for some $i$. Non-negativity and normalization
follow directly from the corresponding statements for the
probability distribution $P$. □

Transformations. — We now consider transformations in
boxworld. First note that all allowed dynamical transformations
in general probabilistic theories (reversible or not) are
linear; this follows from the fact that they have to respect
convex combinations, which correspond to probabilistic mixtures.
For a general proof of this fact see [12].

The allowed transformations, $T$, are defined to be linear
maps with the property that for all $s \in \mathcal{S}^{(N)}$, $Ts \in \mathcal{S}^{(N)}$.
A transformation is reversible if both $T$ and $T^{-1}$ are allowed
transformations. It follows that a reversible transformation
maps the state space $\mathcal{S}^{(N)}$ bijectively onto itself. Furthermore,
since $T$ is linear and acts bijectively on the state space, it
maps extremal states to extremal states. (More generally, one would only
consider a transformation allowed if $T \otimes \mathbb{1}$ is also allowed, a
condition analogous to complete positivity in quantum theory.
However, our result applies without this additional requirement.)

Note that the states $s \in \mathcal{S}^{(N)}$ themselves do not have
a physical meaning; only their scalar products with effects
do, i.e. $\langle A, s\rangle$ (which are probabilities). Since $\langle A, Ts\rangle =
\langle T^\dagger A, s\rangle$, the dynamics may equivalently be specified by
means of the adjoint map $T^\dagger$. (In quantum theory, the analogue
is passing from the Schrödinger to the Heisenberg picture.)
Reversible transformations, $T$, map the state space
bijectively onto itself and, likewise, the adjoint transformations
$T^\dagger$ act accordingly on the cone of effects $\mathcal{K}^{(N)}$.

Lemma 2. — Adjoint reversible transformations map the
cone of effects $\mathcal{K}^{(N)}$ bijectively onto itself. Moreover, they
map the set of extremal effects, $\mathcal{P}^{(N)}$, onto itself.

Proof. Any vector $t \in (\mathcal{K}^{(N)})^*$ can be written $t = \lambda s$ for some
$\lambda \geq 0$ and some $s \in \mathcal{S}^{(N)}$. Then, for any outcome $B \in \mathcal{K}^{(N)}$,
we have

$$\langle T^\dagger B, t\rangle = \langle B, Tt\rangle = \lambda \langle B, Ts\rangle \geq 0,$$

since $Ts \in \mathcal{S}^{(N)}$. From the definition of the dual cone, it
follows that $T^\dagger B \in (\mathcal{K}^{(N)})^{**} = \mathcal{K}^{(N)}$ (note that $\mathcal{K}^{**} = \mathcal{K}$
for every closed convex cone $\mathcal{K}$ (cf. [19])). Therefore, $T^\dagger$
maps the cone of effects $\mathcal{K}^{(N)}$ into itself. The same argument
applies to the inverse $(T^\dagger)^{-1} = (T^{-1})^\dagger$, hence the cone is
mapped bijectively onto itself.

Since it is a convex cone, $\mathcal{K}^{(N)}$ is completely characterized
by its extremal rays. By linearity, $T^\dagger$ maps the extremal rays
of $\mathcal{K}^{(N)}$ onto themselves. From the definition of $\mathcal{K}^{(N)}$, we
know that the cone is the convex hull of the $4^N$ rays formed
by all $A \in \mathcal{P}^{(N)}$. It is elementary to check that these are
indeed the extremal rays. Therefore, for every $A \in \mathcal{P}^{(N)}$,
there exists an $A' \in \mathcal{P}^{(N)}$ and a non-negative number $\lambda$ such
that $T^\dagger(A) = \lambda A'$. To see that $\lambda$ must equal 1, observe that
for every $B \in \mathcal{P}^{(N)}$, there exist (product) states $s_0, s_1 \in \mathcal{S}^{(N)}$
such that $\langle s_0, B\rangle = 0$ and $\langle s_1, B\rangle = 1$. Since this holds in
particular for both $A$ and $A'$, the maximum over all states of
$\langle A, Ts\rangle = \lambda \langle A', s\rangle$ is both 1 and $\lambda$, and it follows that $\lambda = 1$. □

Orthogonal representation of transformations. — There are
$4^N$ extremal effects, and thus $4^N!$ permutations acting on
$\mathcal{P}^{(N)}$. We go on to show that only a tiny fraction of those
is actually realizable in boxworld. It will be convenient to use
a specific representation of $X$, $Z$ and $\mathbb{1}$:

$$X = (1/2, 1/\sqrt{2}, 0), \qquad Z = (1/2, 0, 1/\sqrt{2}), \qquad \mathbb{1} = (1, 0, 0).$$

Lemma 3. — With respect to the representation above, it
holds that any reversible transformation $T$ is orthogonal, i.e.
on $N$ subsystems, $T^\dagger T = I_{3^N}$, where $I_d$ is the $d$-dimensional
identity matrix.

Proof. First observe that with this choice, $\sum_{A \in \mathcal{P}^{(1)}} |A\rangle\langle A| = I_3$
and hence (since $\mathcal{P}^{(N)}$ factorizes) $\sum_{A \in \mathcal{P}^{(N)}} |A\rangle\langle A| = I_{3^N}$.
Then, since $T^\dagger$ permutes the extremal effects,

$$T^\dagger T = T^\dagger \Big( \sum_{A \in \mathcal{P}^{(N)}} |A\rangle\langle A| \Big) T = \sum_{A \in \mathcal{P}^{(N)}} |A\rangle\langle A| = I_{3^N}. \qquad □$$
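The identity used at the start of this proof can be verified numerically. The sketch below builds the four single-site effect vectors in the representation given above and checks that the sum of their outer products is the 3x3 identity, and likewise the 9x9 identity for the sixteen two-site product effects. The helper names (`kron`, `frame_operator`, `is_identity`) are illustrative and not from the paper.

```python
import math
from itertools import product

R = 1 / math.sqrt(2)

# Representation from the text: X, Z, identity; negations are 1 - A.
ONE = (1.0, 0.0, 0.0)
X = (0.5, R, 0.0)
Z = (0.5, 0.0, R)
NOT_X = tuple(o - a for o, a in zip(ONE, X))
NOT_Z = tuple(o - a for o, a in zip(ONE, Z))
SINGLE_SITE = [X, NOT_X, Z, NOT_Z]

def kron(u, v):
    """Tensor (Kronecker) product of two vectors, flattened."""
    return tuple(a * b for a in u for b in v)

def frame_operator(vectors):
    """Sum over the given vectors of the outer product |A><A|."""
    d = len(vectors[0])
    return [[sum(v[i] * v[j] for v in vectors) for j in range(d)]
            for i in range(d)]

def is_identity(m, tol=1e-12):
    """True if the square matrix m equals the identity up to tol."""
    n = len(m)
    return all(abs(m[i][j] - (1.0 if i == j else 0.0)) <= tol
               for i in range(n) for j in range(n))

# All 16 extremal product effects on two gbits.
TWO_SITE = [kron(a, b) for a, b in product(SINGLE_SITE, repeat=2)]
```

Because any permutation of the vectors leaves the sum of outer products unchanged, this is the property that forces the adjoint map to be orthogonal.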

The fact that $T$ (and thus $T^\dagger$) is orthogonal gives rise to
a host of invariants. If one picks any two extremal effects
$Q, R \in \mathcal{P}^{(N)}$, then clearly their inner product is a conserved
quantity: $\langle Q, R\rangle = \langle T^\dagger Q, T^\dagger R\rangle$. However, $|\langle Q, R\rangle| =
4^{-N}\, 3^{N - d_H(Q,R)}$, where $d_H(Q, R)$ is the Hamming distance
between $Q$ and $R$, i.e. the number of places at which $Q$ and
$R$ differ (each site contributes a factor of absolute value 3/4 where
the two effects agree and 1/4 where they differ). Thus the Hamming
distance of extremal effects is a conserved quantity:
$d_H(Q, R) = d_H(T^\dagger Q, T^\dagger R)$.
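The relation between inner products and Hamming distance can be checked exhaustively for small N. The following sketch uses the representation X = (1/2, 1/√2, 0), Z = (1/2, 0, 1/√2) from the text; the string labels and function names are illustrative assumptions.

```python
import math
from itertools import product

R = 1 / math.sqrt(2)

# Single-site extremal effects in the representation given in the text.
EFFECTS = {
    "X": (0.5, R, 0.0),
    "notX": (0.5, -R, 0.0),
    "Z": (0.5, 0.0, R),
    "notZ": (0.5, 0.0, -R),
}

def inner_product(Q, S):
    """Inner product of two product effects, given as tuples of labels.

    For tensor products the inner product factorizes over sites.
    """
    result = 1.0
    for q, s in zip(Q, S):
        result *= sum(a * b for a, b in zip(EFFECTS[q], EFFECTS[s]))
    return result

def hamming(Q, S):
    """Number of sites at which the two label strings differ."""
    return sum(q != s for q, s in zip(Q, S))

def check_invariant(N, tol=1e-12):
    """Verify |<Q, S>| = 4^(-N) * 3^(N - d_H(Q, S)) for all pairs."""
    labels = list(product(EFFECTS, repeat=N))
    for Q in labels:
        for S in labels:
            expected = 3 ** (N - hamming(Q, S)) / 4 ** N
            if abs(abs(inner_product(Q, S)) - expected) > tol:
                return False
    return True
```

Since the absolute inner product is a strictly decreasing function of the Hamming distance, preserving one is equivalent to preserving the other.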

It is well-known in the theory of error correction [20] that
the set of maps on finite strings which preserve the Hamming
distance is highly restricted: the group of those maps is generated
by local transformations and permutations of sites only
(for a proof, see the Appendix). Thus $T^\dagger$ acts as such an
operation on $\mathcal{P}^{(N)}$. Moreover, since the vectors in $\mathcal{P}^{(N)}$ span the
entire space, the action on this set is sufficient to completely
specify $T^\dagger$.

Furthermore, it is straightforward to show that the set of 
allowed local operations comprises exchanging $X$ and $Z$ (relabelling
measurements), exchanging $X$ and $\neg X$ (relabelling
the outcome upon input $X$), exchanging $Z$ and $\neg Z$ (relabelling
the outcome upon input $Z$) and combinations thereof
(see Lemma 8 in the Appendix, for a proof in the general
case).

Main results. — Combining all the previous results proves
the following theorem in the special case of M = 2 measurements
with K = 2 outcomes (a full proof for all M ≥ 2 and
K ≥ 2 is given in the appendix):

Theorem 1. — Every reversible transformation on a system
comprising N subsystems in boxworld, with M ≥ 2 measurements
at every subsystem each having K ≥ 2 outcomes, is a
permutation of subsystems, followed by local relabellings of
measurements and their outcomes.
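To get a feel for how restrictive Theorem 1 is, one can compare the number of transformations it allows with the number of all permutations of the extremal effects. The sketch below assumes that distinct combinations of a subsystem permutation and local relabellings give distinct transformations, so that the group order is N! times (M! (K!)^M)^N; this counting formula is an illustration derived from the theorem's statement, not a formula quoted from the paper.

```python
from math import factorial

def local_relabellings(M, K):
    """Relabel the M measurements, and the K outcomes of each one separately."""
    return factorial(M) * factorial(K) ** M

def boxworld_group_order(N, M, K):
    """Subsystem permutations combined with independent local relabellings.

    Assumes each (permutation, relabellings) pair acts differently.
    """
    return factorial(N) * local_relabellings(M, K) ** N

def all_effect_permutations(N, M, K):
    """Number of permutations of the (M*K)^N extremal product effects."""
    return factorial((M * K) ** N)
```

For two gbits (N = 2, M = K = 2) this gives 128 reversible transformations, compared with 16! (roughly 2 x 10^13) permutations of the sixteen extremal effects.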

Furthermore, we show the following: 

Theorem 2. — In boxworld, every reversible transformation 
maps pure product states to pure product states. This is true 
even if the system is coupled to an arbitrary number of classi- 
cal systems, and if the number of devices and outcomes varies 
from subsystem to subsystem. 

Before giving the proof, we need to slightly extend the notion
of outcome vectors to the general case. We denote the set
of extremal effects for the $i$th subsystem by $\mathcal{P}^i = \{X^i_m(k)\}$,
where $m$ labels the measurements (the number of different
$m$s may depend on $i$) and $k$ the corresponding outcomes (the
number of different $k$s may depend on $m$ and on $i$). These vectors
satisfy $\sum_k X^i_m(k) = \mathbb{1}^i$, where $\mathbb{1}^i$ represents the identity.
Except for these relations, no linear dependencies occur.



The identity on the full system is then $\mathbb{1}^{(N)} := \mathbb{1}^1 \otimes \ldots \otimes \mathbb{1}^N$,
and the extremal effects are $\mathcal{P}^{(N)} := \mathcal{P}^1 \otimes \ldots \otimes \mathcal{P}^N$. The
convex cone $\mathcal{K}^{(N)}$ and the state space $\mathcal{S}^{(N)}$ are defined analogously
to the binary case previously described. The statements
and proofs of Lemmas 1 and 2 remain valid in this more
general case, hence, in particular, adjoint reversible transformations
map $\mathcal{P}^{(N)}$ onto itself.

Proof. To complete the proof of Theorem 2, note that a state
$s \in \mathcal{S}^{(N)}$ is a pure product state (that is, of the form $s =
s_1 \otimes \ldots \otimes s_N$, where all $s_i$ are pure) if and only if $\langle A, s\rangle \in
\{0, 1\}$ for all extremal effects $A \in \mathcal{P}^{(N)}$ (a proof is given in
Lemma 9). Suppose that $s$ is a pure product state and $T$ a
reversible transformation. Since $T^\dagger A$ is again an extremal effect
(Lemma 2), we have

$$\langle A, Ts\rangle = \langle T^\dagger A, s\rangle \in \{0, 1\} \quad \text{for all } A \in \mathcal{P}^{(N)},$$

which proves that $Ts$ must also be a pure product state. □
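The criterion used in this proof is easy to illustrate on probability tables. The sketch below, for two gbits, represents a box as a dictionary from (a, b, x, y) to P(a, b | x, y) and tests whether every extremal effect has probability 0 or 1. The `relabel` helper applies a few of the relabellings named in Theorem 1; its flags and the dictionary encoding are illustrative assumptions rather than the paper's notation.

```python
from itertools import product

def deterministic_box(fa, fb):
    """Pure product state: a = fa[x] on site 1 and b = fb[y] on site 2."""
    return {(a, b, x, y): 1.0 if (a == fa[x] and b == fb[y]) else 0.0
            for a, b, x, y in product((0, 1), repeat=4)}

def pr_box():
    """Maximally non-local box: a XOR b = x AND y with probability 1/2."""
    return {(a, b, x, y): 0.5 if (a ^ b) == (x & y) else 0.0
            for a, b, x, y in product((0, 1), repeat=4)}

def is_pure_product(box):
    """Criterion from the proof: all extremal-effect probabilities in {0, 1}."""
    return all(p in (0.0, 1.0) for p in box.values())

def relabel(box, swap_sites=False, swap_measurements=False, flip_outcome=False):
    """Apply some relabellings; each one only permutes the table entries."""
    out = {}
    for (a, b, x, y), p in box.items():
        if swap_sites:            # exchange the two subsystems
            a, b, x, y = b, a, y, x
        if swap_measurements:     # exchange the two measurements on site 1
            x = 1 - x
        if flip_outcome and x == 0:  # relabel outcomes of measurement 0, site 1
            a = 1 - a
        out[(a, b, x, y)] = p
    return out
```

Because a relabelling merely permutes the entries of the table, a table with entries in {0, 1} stays that way, and a PR box (with entries 1/2) can never be reached from a deterministic product box in this manner.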

Note that Theorem 1 does not, in general, apply to the case 
of site-dependent numbers of measurements. For example, 
suppose that we have two sites, where the first has two binary 
measurements, X and Z, and the second allows only a single 
binary measurement, Y. (In other words, a gbit is coupled 
to a classical bit.) It is then straightforward to construct a re- 
versible CNOT operation, where the classical bit is the control 
bit. For example, there is an adjoint reversible transformation 
that acts as 

$$A \otimes Y \mapsto A \otimes Y, \qquad A \otimes \neg Y \mapsto \neg A \otimes \neg Y$$

for all $A \in \{X, Z, \neg X, \neg Z\}$.
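The hybrid CNOT can be written down explicitly as a linear map and checked against its defining action. In the sketch below, the gbit effects use the representation from the main text, while the classical bit is modelled with Y = (1, 0) and not-Y = (0, 1); this encoding of the classical bit, the diagonal form of the map and all helper names are assumptions made for illustration.

```python
import math

R = 1 / math.sqrt(2)

# Gbit effects (representation from the text) and their negations.
GBIT = {"X": (0.5, R, 0.0), "notX": (0.5, -R, 0.0),
        "Z": (0.5, 0.0, R), "notZ": (0.5, 0.0, -R)}
NEGATE = {"X": "notX", "notX": "X", "Z": "notZ", "notZ": "Z"}

# Assumed encoding of the classical bit: one measurement Y with two outcomes.
Y, NOT_Y = (1.0, 0.0), (0.0, 1.0)

def kron(u, v):
    """Tensor product of a gbit vector u (length 3) and a bit vector v (length 2)."""
    return tuple(a * b for a in u for b in v)

def cnot_adjoint(w):
    """Adjoint CNOT on R^3 (x) R^2: negate the gbit effect when the bit is not-Y.

    Negation A -> 1 - A acts as diag(1, -1, -1) in this representation.
    """
    flip = (1.0, -1.0, -1.0)
    return tuple(w[2 * i + j] * (flip[i] if j == 1 else 1.0)
                 for i in range(3) for j in range(2))

def close(u, v, tol=1e-12):
    """Componentwise comparison of two vectors."""
    return all(abs(a - b) <= tol for a, b in zip(u, v))

# Identity effect on the hybrid system: 1_gbit (x) (Y + not-Y).
HYBRID_ONE = kron((1.0, 0.0, 0.0), (1.0, 1.0))
```

The map fixes the identity effect and permutes the eight extremal product effects, yet it is not a product of local relabellings, which is why Theorem 1 fails for this hybrid system.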

In the case of a system composed of several classical sub- 
systems, Theorem 1 also does not hold — the dynamics in such 
a case is non-trivial. Nevertheless, Theorem 2 does apply to 
this case — it remains impossible to prepare entangled states 
from separable ones. 

Conclusions. — We have shown that the set of reversible op- 
erations in boxworld is trivial: the only possible operations 
relabel subsystems, local measurements and their outcomes. 
In particular, there is no boxworld analogue of an entangling 
unitary in quantum theory, one cannot reversibly prepare non- 
local states from separable ones, nor perform useful computa- 
tions reversibly. 

In addition, the results have consequences for the interplay 
between dynamics and measurements in boxworld: suppose 
we have a system comprising a particle, A, and two observers,
B and C, initially in an uncorrelated tripartite product state. In
quantum theory, if B measures A, but C does not take part in
the interaction, then C can model the corresponding dynamics
by a unitary transformation on the AB-system. That is, C can
view the whole interaction as reversible while retaining the
ability to correctly predict the outcome probabilities of any
future measurements. (Theories with such a property might
be called fundamentally reversible.) In boxworld, on the other
hand, this is not true: B's measurement on A would have to
create correlations between A and B, but this could never be
achieved by a reversible transformation. Hence C would have
to model the AB-measurement using irreversible dynamics,
even if C did not take part in the interaction itself.

It would be interesting to extend our result to explore which 
state spaces are compatible with fundamentally reversible the- 
ories in this sense, or with theories that are transitive, i.e. that 
every pure state can be reversibly mapped to any other. This 
property has been used by Hardy as an axiom for quantum theory
[16]. Both conditions seem to strongly restrict the possible
geometry of the state space, and an interesting open question 
is how non-local such theories can be. 

Appendix

This appendix contains the proof of Theorem 1: Every reversible
transformation on a system comprising N subsystems
in boxworld, with M ≥ 2 measurements at every subsystem
each having K ≥ 2 outcomes, is a permutation of subsystems,
followed by local operations.

The proof idea is the same as in the case M = K = 2: find
a particular representation of the vectors $X_m(k)$ and $\mathbb{1}$
(corresponding to the previous vectors $X$, $Z$, $\neg X$, $\neg Z$ and $\mathbb{1}$)
such that reversible transformations are orthogonal, and such
that the scalar products of those vectors yield useful invariants.
We recall that the adjoints of reversible transformations preserve
the cone of effects and so permute extremal effects (this
is Lemma 2 applied to this case).

We start with the following observation:

Lemma 4. For every $N \in \mathbb{N}$, there exist unit vectors
$\{w_i\}_{i=1}^{N+1}$ in $\mathbb{R}^N$ with the properties

• $\langle w_i, w_j\rangle = -\frac{1}{N}$ if $i \neq j$,

• $\sum_{i=1}^{N+1} w_i = 0$, and

• $\frac{N}{N+1} \sum_{i=1}^{N+1} |w_i\rangle\langle w_i| = I_N$.

Proof. Rather than giving the vectors explicitly, we construct
them implicitly from the standard $N$-simplex in $\mathbb{R}^{N+1}$: let
$e_i$ be the $i$th standard unit vector in $\mathbb{R}^{N+1}$ and $c$ the center
of those vectors, that is $c := \frac{1}{N+1} \sum_{i=1}^{N+1} e_i$. Define
$v_i := e_i - c$, so that the cosines of the angles between those
vectors ($i \neq j$) are $\frac{\langle v_i, v_j\rangle}{\|v_i\| \|v_j\|} = -\frac{1}{N}$. By construction,
we have $\sum_{i=1}^{N+1} v_i = 0$, so the vectors are linearly dependent.
The set $\{w_i\}_{i=1}^{N+1}$ are then the vectors resulting from embedding
normalized versions of the $v_i$s isometrically into $\mathbb{R}^N$. The first
two claimed equalities follow immediately. The third can be
confirmed by computing

$$\Big\| \frac{N}{N+1} \sum_{i=1}^{N+1} |w_i\rangle\langle w_i| - I_N \Big\|_2^2 = 0,$$

where the Schatten 2-norm is defined by $\|A\|_2^2 := \mathrm{tr}(A A^\dagger)$,
which involves only scalar products of the form $\langle w_i, w_j\rangle$. □

The vectors $X_m(k)$ representing the $k$th measurement outcomes
for the $m$th measurement (counting from zero) can be
constructed as follows:

• Choose $\mathbb{1} \neq 0$ arbitrarily,

• choose $X_m(0), X_m(1), \ldots, X_m(K-2)$ for all $m$ such
that all obtained vectors are linearly independent,

• define $X_m(K-1)$ as $\mathbb{1} - \sum_{k=0}^{K-2} X_m(k)$.

We choose these in a particular way in order to simplify the
subsequent argument: the single-site effects will be vectors in
$\mathbb{R}^{M(K-1)+1} = (\mathbb{R}^M \otimes \mathbb{R}^{K-1}) \oplus \mathbb{R}$. Let $e_m$ denote the $m$th
standard unit vector in $\mathbb{R}^M$, and $\mathbb{1}$ be the unit vector on the
direct sum space, $\mathbb{R}$. Then, define

$$X_m(k) := \frac{\sqrt{M(K-1)}}{K}\, e_{m+1} \otimes w_{k+1} + \frac{1}{K} \mathbb{1} \qquad (3)$$

for $0 \leq m \leq M-1$ and $0 \leq k \leq K-1$. Useful properties
of these vectors are given in the following lemma.

Lemma 5. In the representation given above, we have

$$\sum_{k=0}^{K-1} X_m(k) = \mathbb{1}, \qquad (4)$$

$$\sum_{m=0}^{M-1} \sum_{k=0}^{K-1} |X_m(k)\rangle\langle X_m(k)| = \frac{M}{K}\, I_{M(K-1)+1}, \quad \text{and} \qquad (5)$$

$$\langle X_m(k), X_{m'}(k')\rangle = \frac{1}{K^2} \begin{cases} 1 & m \neq m' \\ 1 - M & m = m',\ k \neq k' \\ 1 + M(K-1) & m = m',\ k = k'. \end{cases}$$

Moreover, reversible transformations are orthogonal with respect
to this representation.

Proof. The three equations can be verified by direct calculation.
That reversible transformations are orthogonal is simply
the extension of Lemma 3 to the present case. □
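The inner products in Lemma 5 can be checked numerically. The sketch below builds the simplex vectors of Lemma 4 and the effect vectors of equation (3), then verifies equation (4) and the table of inner products. As a simplifying assumption, the simplex vectors are kept in R^K (inside the (K-1)-dimensional subspace orthogonal to the all-ones vector) rather than embedded isometrically into R^(K-1); this leaves all inner products unchanged. Indices start at zero, and all function names are illustrative.

```python
import math
from itertools import product

def simplex_vectors(n):
    """n + 1 unit vectors with pairwise inner product -1/n, kept in R^(n+1)."""
    dim = n + 1
    c = 1.0 / dim                     # each coordinate of the centre
    norm = math.sqrt(n / dim)         # length of e_i - c
    return [tuple(((1.0 if j == i else 0.0) - c) / norm for j in range(dim))
            for i in range(dim)]

def inner(u, v):
    return sum(a * b for a, b in zip(u, v))

def effect_vector(m, k, M, K):
    """X_m(k) = sqrt(M(K-1))/K * e_m (x) w_k + (1/K) * identity (equation (3))."""
    w = simplex_vectors(K - 1)[k]
    alpha = math.sqrt(M * (K - 1)) / K
    tensor_part = tuple(alpha * (1.0 if mm == m else 0.0) * wj
                        for mm in range(M) for wj in w)
    return tensor_part + (1.0 / K,)   # last slot is the identity direction

def expected_scaled_inner(m, k, m2, k2, M, K):
    """K^2 <X_m(k), X_m'(k')> according to Lemma 5."""
    if m != m2:
        return 1
    return 1 + M * (K - 1) if k == k2 else 1 - M

def check_lemma5(M, K, tol=1e-9):
    """Verify equation (4) and the inner-product table for given M, K."""
    idx = list(product(range(M), range(K)))
    vecs = {mk: effect_vector(*mk, M, K) for mk in idx}
    dim = M * K + 1
    one = tuple([0.0] * (dim - 1) + [1.0])
    for m in range(M):
        total = [sum(vecs[(m, k)][j] for k in range(K)) for j in range(dim)]
        if any(abs(t - o) > tol for t, o in zip(total, one)):
            return False
    for (m, k), (m2, k2) in product(idx, repeat=2):
        got = K * K * inner(vecs[(m, k)], vecs[(m2, k2)])
        if abs(got - expected_scaled_inner(m, k, m2, k2, M, K)) > tol:
            return False
    return True
```

For M = K = 2 the prefactor sqrt(M(K-1))/K equals 1/√2 and 1/K equals 1/2, which recovers the gbit representation used in the main text.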

We remark that the inner product $1 - M$ for $m = m'$, $k \neq k'$
is the reason why Theorem 1 does not hold in the case of
classical systems ($M = 1$). In the following, we will assume
that $M \geq 2$.

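We now consider an adjoint reversible transformation, $T^\dagger$.
Note that for all $s \in \mathcal{S}^{(N)}$,

$$1 = \langle \mathbb{1}^{(N)}, Ts\rangle = \langle T^\dagger \mathbb{1}^{(N)}, s\rangle,$$

from which it follows that $T^\dagger \mathbb{1}^{(N)} = \mathbb{1}^{(N)}$. Moreover, we
have the following property: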

Lemma 6. Let $Q, R \in \mathcal{P}^{(N)}$ be two extremal effects that differ
at exactly one site, and let $T^\dagger$ be an adjoint reversible
transformation. Then, $T^\dagger Q$ and $T^\dagger R$ also differ at exactly
one site.

Proof. Since $Q$ and $R$ factorize, we can compute the inner
product $\langle Q, R\rangle$ termwise. Let $i$ be the site where $Q$ and $R$
differ, and let $Q_i$ and $R_i$ be the corresponding factors.

First, consider the case that $Q_i$ and $R_i$ represent different
outcomes of the same measurement. Then, the inner product
is the negative value

$$K^{2N} \langle Q, R\rangle = (1 - M)\,(1 + M(K-1))^{N-1},$$

which is the smallest value that can possibly be attained.
Hence $K^{2N} \langle T^\dagger Q, T^\dagger R\rangle$ has the same value, such that $T^\dagger Q$
and $T^\dagger R$ also differ at a single site only (where they refer to
different outcomes of the same measurement).

The alternative case is where $Q_i$ and $R_i$ represent outcomes
of different measurements. Note that $\mathbb{1} - Q_i - R_i \notin \mathcal{K}^{(1)}$ and
hence $\mathbb{1}^{(N)} - Q - R \notin \mathcal{K}^{(N)}$. Since $T^\dagger$ preserves $\mathbb{1}^{(N)}$ and
maps the cone bijectively to itself (cf. Lemma 2), we have
$\mathbb{1}^{(N)} - T^\dagger Q - T^\dagger R \notin \mathcal{K}^{(N)}$, from which it follows that $T^\dagger Q$
and $T^\dagger R$ correspond to outcomes of different measurements
on at least one factor. Furthermore,

$$K^{2N} \langle Q, R\rangle = (1 + M(K-1))^{N-1}$$

is preserved. This is the largest value that can be attained subject
to the constraint that they represent outcomes of different
measurements on at least one factor. It follows that $T^\dagger Q$ and
$T^\dagger R$ are identical in all but one tensor factor. □
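The extremality claim in the first case of this proof can be confirmed by brute force for small systems. The sketch below works with the integer quantity K^(2N) times the inner product, using the per-site values from Lemma 5, and lists all pairs of extremal effects that attain its minimum. Each local effect is encoded as a pair (m, k); this encoding and the function names are illustrative assumptions.

```python
from itertools import product

def site_factor(q, r, M, K):
    """K^2 times the single-site inner product, taken from Lemma 5."""
    (m, k), (m2, k2) = q, r
    if m != m2:
        return 1
    return 1 + M * (K - 1) if k == k2 else 1 - M

def scaled_inner(Q, R, M, K):
    """K^(2N) <Q, R> for product effects Q, R given as tuples of (m, k)."""
    out = 1
    for q, r in zip(Q, R):
        out *= site_factor(q, r, M, K)
    return out

def minimisers(N, M, K):
    """Smallest value of K^(2N) <Q, R> and all ordered pairs attaining it."""
    local = list(product(range(M), range(K)))
    words = list(product(local, repeat=N))
    values = {(Q, R): scaled_inner(Q, R, M, K) for Q in words for R in words}
    smallest = min(values.values())
    return smallest, [pair for pair, v in values.items() if v == smallest]

def differ_once_same_measurement(Q, R):
    """True if Q, R differ at exactly one site, in the outcome only."""
    diffs = [(q, r) for q, r in zip(Q, R) if q != r]
    return len(diffs) == 1 and diffs[0][0][0] == diffs[0][1][0]
```

For M at least 2 the minimum equals (1 - M)(1 + M(K-1))^(N-1), and in the small cases tested below it is attained only by pairs that differ at one site in the outcome of the same measurement.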

The proof of Theorem 1 is now completed using some properties
of the Hamming distance. The list of local effects (listing
the measurement-outcome pairs at the successive sites)
can be used to form a string in $\mathbb{Z}_d^N$, where $d = MK$. The
Hamming distance between two strings, $Q$ and $R$, is defined
by

$$d_H(Q, R) := |\{i : Q_i \neq R_i\}|.$$

Lemma 6 shows that if $Q, R \in \mathcal{P}^{(N)}$ are two arbitrary extremal
effects with $d_H(Q, R) = 1$, then the transformed effects
satisfy $d_H(T^\dagger Q, T^\dagger R) = 1$.

In fact, all reversible operations that preserve Hamming dis- 
tance 1 preserve the Hamming distance between all effects. 
Furthermore, the set of Hamming distance preserving trans- 
formations can be expressed as combinations of permutations 
of subsystems and local permutations (see for example Theorem
3.54 of [20]). We give a proof of this for completeness.

Lemma 7. Let $\mathcal{A}$ be a finite alphabet, with $\mathcal{A}^N$ the set of
length-$N$ words. Further, let $G_N = S_N$ be the group of permutations
of the $N$ letter positions and $G_L = (S_{|\mathcal{A}|})^N$ be the group of local
transformations of $\mathcal{A}^N$, which act independently at each position.

Assume that $T^\dagger : \mathcal{A}^N \to \mathcal{A}^N$ is invertible. If $T^\dagger$ has
the property that for all $s, t \in \mathcal{A}^N$, $d_H(s, t) = 1 \implies
d_H(T^\dagger s, T^\dagger t) = 1$, then $T^\dagger$ is a composition of operations
from $G_N$ and $G_L$.

Proof. Choose an arbitrary set $a_i \in \mathcal{A}$, for $i = 1, \ldots, N$. Set
$s = (a_1, \ldots, a_N)$. Left-multiplying by a local operation if
necessary, we may assume that $T^\dagger(s) = s$.

For $i \in 1 \ldots N$, consider the set $L_i$ of words of the form

$$L_i = \{(a_1, \ldots, a_{i-1}, x, a_{i+1}, \ldots, a_N) : x \in \mathcal{A}\},$$

i.e. $s$ and strings that differ from $s$ only at position $i$. Because
the elements of $L_i$ all have mutual Hamming distance equal
to one, there must be a function $\pi$ such that $T^\dagger(L_i) = L_{\pi(i)}$.
Since $T^\dagger$ is invertible, $\pi$ is a permutation, which may be
thought of as an element of $G_N$. Because $(\pi^{-1}) T^\dagger(L_i) = L_i$
for all $i$, there is no loss of generality in assuming that $T^\dagger$
takes $L_i$ to itself. Employing yet another local operation if
necessary, we may even assume that $T^\dagger$ acts like identity on
all elements of $L_i$, and hence on all strings with Hamming
distance 1 to $s$.

Define the weight of an element $t \in \mathcal{A}^N$ to be $\mathrm{wt}(t) =
d_H(t, s)$. What we have shown so far amounts to the fact that $T^\dagger$
fixes all words of weight zero and one. Next, we prove by
induction that $T^\dagger$ fixes the words of any weight $w$ (and hence
all of $\mathcal{A}^N$).

Assume the claim has been established for weights up to
$w - 1$. If $t$ has weight $w > 1$, it is uniquely specified by the
$w$ words $r_i$ which have weight $\mathrm{wt}(r_i) = w - 1$ and Hamming
distance $d_H(r_i, t) = 1$ to $t$ (in fact, any two words $r_i, r_j$ from
this set are sufficient to specify $t$). But since the weights of all
of the $r_i$ and $d_H(r_i, t)$ are preserved by $T^\dagger$ by the induction
hypothesis, $T^\dagger$ must fix $t$. □
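For very small alphabets and word lengths, the statement of Lemma 7 can be cross-checked by exhaustive search. The sketch below counts all bijections of the word set that map pairs at Hamming distance one to pairs at Hamming distance one, and compares the count with N! times (|A|!)^N, the number of compositions of position permutations and local letter permutations. The brute-force approach and function names are illustrative and only feasible for tiny cases.

```python
from itertools import permutations, product
from math import factorial

def hamming(s, t):
    """Number of positions at which the two words differ."""
    return sum(a != b for a, b in zip(s, t))

def count_distance_one_preservers(alphabet_size, length):
    """Count bijections of A^N that keep Hamming distance 1 at distance 1."""
    words = list(product(range(alphabet_size), repeat=length))
    n = len(words)
    adjacent = [[hamming(words[i], words[j]) == 1 for j in range(n)]
                for i in range(n)]
    edges = [(i, j) for i in range(n) for j in range(i + 1, n)
             if adjacent[i][j]]
    # perm[i] is the index of the image of word i.
    return sum(1 for perm in permutations(range(n))
               if all(adjacent[perm[i]][perm[j]] for i, j in edges))

def predicted_count(alphabet_size, length):
    """Order of the group generated by G_N and G_L in Lemma 7."""
    return factorial(length) * factorial(alphabet_size) ** length
```

For a binary alphabet and N = 3 the words form the vertices of a cube, and the search recovers the 48 symmetries of the cube graph.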

It follows that all reversible transformations in boxworld can be
formed by composing subsystem permutations and local
permutations. However, the set of allowed local permutations is
further restricted (Hamming distance preservation is a necessary
but not sufficient condition on $T^\dagger$):

Lemma 8. The only local reversible operations allowed in 
boxworld are relabellings of measurements and their out- 
comes (separately for each measurement). Furthermore, all 
possible local relabellings are allowed transformations, re- 
gardless of the total number of subsystems, N. 

Proof. Recall <j4j and note that these are the only combina- 
tions of extremal effects that sum to 1: otherwise the iden- 
tity 1 could be decomposed into a sum involving two effects 
X m (k) and X m >(k') with m ^ m', so there would be a state 
s with (X m (k), s) = (X m ,(k'),s) = 1, for which (1, s) > 2, 
a contradiction. 

Consider a measurement $m$, then

$$\sum_k T^\dagger X_m(k) = T^\dagger \sum_k X_m(k) = T^\dagger \mathbb{1} = \mathbb{1},$$

so each member of $\{T^\dagger X_m(k) : 0 \leq k \leq K(m) - 1\}$
must correspond to the same measurement. Hence all local
reversible adjoint transformations permute the measurements,
and, for each measurement separately, permute the outcomes.

To see that all permutations are allowed if the number of
outcomes $K$ is the same for every measurement $m$, note that
the representation of $X_m(k)$ given earlier on
$(\mathbb{R}^M \otimes \mathbb{R}^{K-1}) \oplus \mathbb{R}$
permits that all those permutations are implemented as allowed
linear (hence orthogonal) transformations: relabelling the
measurements corresponds to permuting the standard unit vectors
$e_m$ of $\mathbb{R}^M$ (constituting an $M$-dimensional irreducible
representation of the symmetric group $S_M$), while relabelling
the outcomes corresponds to symmetry transformations of the
$(K - 1)$-simplex in $\mathbb{R}^{K-1}$ with $K$ vertices $w_k$
(a $(K - 1)$-dimensional irreducible representation of $S_K$).

In the case that the number of outcomes $K = K(m)$ depends on
the measurement $m$, the vector space carrying the local effects
will analogously be
$\left(\bigoplus_{m=0}^{M-1} \mathbb{R}^{K(m)-1}\right) \oplus \mathbb{R}$.
This allows us to represent the permutation of outcomes linearly,
as before, while the permutations of measurements $m$ and $m'$
with $K(m) = K(m')$ correspond to permutations of direct
summands.

We have thus proven that every local relabelling transformation
is an allowed transformation in boxworld. It remains to show
that $T^\dagger$ is allowed if the single system is coupled to
others (i.e. that $T^\dagger \otimes \mathbb{1}$ is an allowed
transformation). (The analogue in quantum theory is to prove
complete positivity.) This follows from the fact that local
relabellings preserve the no-signalling, positivity and
normalization constraints, such that $T \otimes \mathbb{1}$ maps
the no-signalling polytope (that is, the state space) onto
itself. □
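
The last step of the proof can be illustrated numerically for two subsystems. The Python sketch below is a minimal example under stated assumptions: it uses the standard PR box with binary measurements and outcomes, stores the state as a table P[a, b, x, y] of conditional probabilities, and the function names pr_box, relabel_alice and is_valid_state are hypothetical helpers introduced here. It relabels the measurements and outcomes on the first subsystem only and checks that positivity, normalization and no-signalling still hold.

```python
from itertools import product

BITS = (0, 1)

def pr_box():
    """PR box: P(a,b|x,y) = 1/2 if a XOR b == x AND y, else 0."""
    return {(a, b, x, y): 0.5 if (a ^ b) == (x & y) else 0.0
            for a, b, x, y in product(BITS, repeat=4)}

def relabel_alice(P, meas_perm, outcome_perms):
    """Relabel measurements (x -> meas_perm[x]) and, separately for each
    measurement x, outcomes (a -> outcome_perms[x][a]) on the first subsystem."""
    return {(outcome_perms[x][a], b, meas_perm[x], y): p
            for (a, b, x, y), p in P.items()}

def is_valid_state(P, tol=1e-12):
    """Check positivity, normalization and no-signalling in both directions."""
    if any(p < -tol for p in P.values()):
        return False
    for x, y in product(BITS, repeat=2):
        if abs(sum(P[a, b, x, y] for a in BITS for b in BITS) - 1) > tol:
            return False
    for a, x in product(BITS, repeat=2):   # first marginal must not depend on y
        if abs(sum(P[a, b, x, 0] for b in BITS)
               - sum(P[a, b, x, 1] for b in BITS)) > tol:
            return False
    for b, y in product(BITS, repeat=2):   # second marginal must not depend on x
        if abs(sum(P[a, b, 0, y] for a in BITS)
               - sum(P[a, b, 1, y] for a in BITS)) > tol:
            return False
    return True

# Swap the two measurements and flip the outcome of measurement 0 only.
Q = relabel_alice(pr_box(), meas_perm=(1, 0), outcome_perms=((1, 0), (0, 1)))
print(is_valid_state(pr_box()), is_valid_state(Q))
```

The check is only a sanity test on one state and one relabelling; the lemma itself covers every state and every local relabelling.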

In the final part of the appendix, we prove that a state $s$ is a
pure product state if and only if all the probabilities $(A, s)$
are either 0 or 1 with respect to extremal effects $A$ (in the
most general case that the number of measurements and outcomes
varies from site to site). This has been used in the proof of
Theorem 2.

Lemma 9. Let $s \in \mathcal{S}^{(N)}$ be a normalized state on $N$
arbitrary boxworld subsystems (some of which may be classical).
Then, $s$ is a pure product state if and only if
$(A, s) \in \{0, 1\}$ for all extremal effects
$A = A_1 \otimes \cdots \otimes A_N \in \mathcal{P}^{(N)}$.

Proof. If $s = s_1 \otimes \cdots \otimes s_N$ is a product of pure
states, then $(A_i, s_i) \in \{0, 1\}$ for every $i$, such that
$(A, s)$ is either 0 or 1. It remains to prove the converse.
Suppose that $s$ is any state with $(A, s) \in \{0, 1\}$ for all
$A \in \mathcal{P}^{(N)}$. The idea is to construct a pure product
state $\tilde{s}$ with $(A, \tilde{s}) = (A, s)$ for all $A$, which
proves that $s = \tilde{s}$. To this end, note that the
single-system decomposition of the identity,
$\mathbb{1} = \sum_k X_m(k)$, has the following generalization: if
the $m_i$ are arbitrary local measurement devices, then

$$\mathbb{1}^{(N)} = \sum_{k_1, \ldots, k_N} X_{m_1}(k_1) \otimes X_{m_2}(k_2) \otimes \cdots \otimes X_{m_N}(k_N), \qquad (6)$$

where the sum is over all outcomes (the number of outcomes
may depend on the measurement). It follows that

$$1 = \sum_{k_1, \ldots, k_N} \left( X_{m_1}(k_1) \otimes X_{m_2}(k_2) \otimes \cdots \otimes X_{m_N}(k_N), s \right),$$

so exactly one of the addends must be 1, while all others are
0. Hence, to every string $m = (m_1, m_2, \ldots, m_N)$ describing
local choices of measurements, there is a unique string of
corresponding outcomes $k(m) = (k_1, k_2, \ldots, k_N)$ such that
$(X_{m_1}(k_1) \otimes \cdots \otimes X_{m_N}(k_N), s) = 1$, while
the inner product is 0 for all other outcome combinations. It
remains to show that each $k_i$ only depends on $m_i$; in this
case, we can construct a state
$\tilde{s} = \tilde{s}_1 \otimes \cdots \otimes \tilde{s}_N$,
generating the same probabilities as $s$, factor by factor. So
suppose there were strings $m$ and $\tilde{m}$ with
$m_i = \tilde{m}_i$, but $k_i \neq \tilde{k}_i$ (where
$\tilde{k} := k(\tilde{m})$); then

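$$\begin{aligned}
1 &= \Big( \mathbb{1} \otimes \cdots \otimes \mathbb{1} \otimes \sum_{k'} X_{m_i}(k') \otimes \mathbb{1} \otimes \cdots \otimes \mathbb{1}, s \Big) \\
&\geq \big( \mathbb{1} \otimes \cdots \otimes \mathbb{1} \otimes X_{m_i}(k_i) \otimes \mathbb{1} \otimes \cdots \otimes \mathbb{1}, s \big) \\
&\quad + \big( \mathbb{1} \otimes \cdots \otimes \mathbb{1} \otimes X_{\tilde{m}_i}(\tilde{k}_i) \otimes \mathbb{1} \otimes \cdots \otimes \mathbb{1}, s \big) \\
&\geq \big( X_{m_1}(k_1) \otimes \cdots \otimes X_{m_N}(k_N), s \big) \\
&\quad + \big( X_{\tilde{m}_1}(\tilde{k}_1) \otimes \cdots \otimes X_{\tilde{m}_N}(\tilde{k}_N), s \big) \\
&= 2,
\end{aligned}$$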

where the last inequality follows by applying the decomposition
of the identity to the remaining $N - 1$ factors. This is a
contradiction. □
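
For the smallest non-trivial case, the statement of Lemma 9 can be confirmed by exhaustive enumeration. The following Python sketch assumes two subsystems, each with two binary measurements, and represents a state by its table P[a, b, x, y]; the names table, no_signalling and is_product are hypothetical helpers introduced for this illustration. It lists every assignment of 0/1 probabilities, keeps those that are no-signalling (that is, valid boxworld states) and tests whether each one factorizes.

```python
from itertools import product

BITS = (0, 1)

def table(f):
    """Probability table P[a,b,x,y] of a deterministic assignment f: (x,y) -> (a,b)."""
    return {(a, b, x, y): 1 if f[x, y] == (a, b) else 0
            for a, b, x, y in product(BITS, repeat=4)}

def no_signalling(P):
    """Each party's marginal is independent of the other party's measurement."""
    first = all(sum(P[a, b, x, 0] for b in BITS) == sum(P[a, b, x, 1] for b in BITS)
                for a, x in product(BITS, repeat=2))
    second = all(sum(P[a, b, 0, y] for a in BITS) == sum(P[a, b, 1, y] for a in BITS)
                 for b, y in product(BITS, repeat=2))
    return first and second

def is_product(P):
    """P(a,b|x,y) == P_A(a|x) * P_B(b|y), with marginals read off at y=0 and x=0."""
    return all(P[a, b, x, y] ==
               sum(P[a, bb, x, 0] for bb in BITS) * sum(P[aa, b, 0, y] for aa in BITS)
               for a, b, x, y in product(BITS, repeat=4))

settings = list(product(BITS, repeat=2))   # measurement pairs (x, y)
outcomes = list(product(BITS, repeat=2))   # outcome pairs (a, b)

states = []
for values in product(outcomes, repeat=len(settings)):
    P = table(dict(zip(settings, values)))
    if no_signalling(P):
        states.append(P)

print(len(states), all(is_product(P) for P in states))
```

In this setting the surviving tables are exactly those in which each outcome is a function of the local measurement alone, which is the content of the step "each $k_i$ only depends on $m_i$" in the proof.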



